Improving the Performance of LVCSR Using Ensemble of Acoustic Models
Abstract
Recent advances in machine learning have drawn attention to new theories of learning and new approaches. Among these, the ensemble method has received wide attention and has proved promising for classification problems. Simply put, an ensemble method is a learning algorithm that constructs a set of “weak” classifiers and then combines their predictions to produce a more accurate classification. The underlying idea is that combining diverse classifiers with uncorrelated, and ideally complementary, error patterns can offer improved performance and robust generalization. Given its success on many classification problems, we began investigating how to adapt ensemble techniques to continuous speech recognition. Continuous speech recognition is acknowledged as one of the most challenging classification tasks. The performance of an automatic speech recognition (ASR) system is degraded by a number of factors, such as noise corruption, variability in speaker and speaking mode, changes in environmental conditions, transmission channel effects, inaccurate model assumptions, and the complexity of language. The primary goal of our research is to discover methods for ensemble construction and combination that meet these special requirements of continuous speech recognition. We propose several novel ensemble-based acoustic model training and combination schemes, and test their effectiveness on real-world speech corpora.
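The claim that combining classifiers with uncorrelated errors improves accuracy can be illustrated with a small calculation. The sketch below is not taken from the proposal: it assumes a binary task, an odd number of classifiers that each have the same accuracy, fully independent errors (an idealization that real acoustic models will not satisfy), and simple majority voting. The function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb


def majority_vote_accuracy(n_classifiers: int, p_correct: float) -> float:
    """Probability that a strict majority of independent classifiers is correct.

    Assumes a binary task, identical per-classifier accuracy p_correct,
    and statistically independent errors (an idealized setting).
    """
    if n_classifiers < 1 or n_classifiers % 2 == 0:
        raise ValueError("use a positive, odd number of classifiers to avoid ties")
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be a probability")

    # Smallest number of correct votes that still forms a strict majority.
    k_min = n_classifiers // 2 + 1
    # Sum the binomial probabilities of k_min, ..., n correct votes.
    return sum(
        comb(n_classifiers, k)
        * p_correct**k
        * (1.0 - p_correct) ** (n_classifiers - k)
        for k in range(k_min, n_classifiers + 1)
    )


if __name__ == "__main__":
    for n in (1, 3, 5, 11):
        print(n, round(majority_vote_accuracy(n, 0.6), 4))
```

Under these assumptions the voted accuracy rises with the ensemble size whenever each classifier is better than chance, and falls when each is worse than chance. Correlated errors reduce the gain, which is why diversity among the ensemble members matters.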
Preliminary results are described in this proposal, in particular:
• An utterance-level boosting training algorithm for large-scale acoustic modeling
• A frame-level boosting training algorithm using a word error rate reduction criterion
• N-best list re-ranking and ROVER combination to generate a better hypothesis
Encouraging experimental results convince us that the ensemble technique is a promising method with the potential to substantially improve the performance of a large vocabulary continuous speech recognition (LVCSR) system. However, research on ensemble methods for speech recognition is still at an early stage, and open questions on ensemble generation and hypothesis combination remain to be addressed. This proposal sets out several key research topics that, if successfully addressed, have the potential to significantly increase the accuracy of ensemble-based speech recognition systems. These include the following:
• Training criteria targeted at reducing word error rate rather than sentence error rate.
• Integrating data manipulation and feature manipulation methods for continuous speech recognition.
• Combination methods working on different objects, at different levels, and at different decoding stages.
• An ensemble-based semi-supervised acoustic model training algorithm using labeled and unlabeled data.
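As a rough illustration of N-best list re-ranking with an ensemble, the sketch below re-orders a list of hypotheses by a weighted sum of the scores assigned by several models. The abstract does not specify the proposal's scoring or weighting scheme, so everything here is an assumption: the function name `rerank_nbest`, the use of log-domain scores, the linear combination, and the toy scores in the usage example are all hypothetical.

```python
from typing import Dict, List, Sequence


def rerank_nbest(
    nbest: Sequence[str],
    model_scores: Sequence[Dict[str, float]],
    weights: Sequence[float],
) -> List[str]:
    """Re-order an N-best list by a weighted sum of per-model scores.

    nbest:        candidate transcriptions for one utterance
    model_scores: one dict per ensemble member, mapping each hypothesis
                  to that model's score (assumed to be log-domain)
    weights:      one combination weight per ensemble member
    """
    if len(model_scores) != len(weights):
        raise ValueError("need exactly one weight per model")
    for scores in model_scores:
        missing = [hyp for hyp in nbest if hyp not in scores]
        if missing:
            raise ValueError(f"model is missing scores for: {missing}")

    def combined_score(hyp: str) -> float:
        # Linear combination of the ensemble members' scores.
        return sum(w * scores[hyp] for w, scores in zip(weights, model_scores))

    # Highest combined score first; sorted() is stable, so ties keep input order.
    return sorted(nbest, key=combined_score, reverse=True)


if __name__ == "__main__":
    # Toy, made-up scores for two hypothetical acoustic models.
    hypotheses = ["wreck a nice beach", "recognize speech"]
    model_a = {"wreck a nice beach": -11.5, "recognize speech": -12.0}
    model_b = {"wreck a nice beach": -14.0, "recognize speech": -10.0}
    print(rerank_nbest(hypotheses, [model_a, model_b], [0.5, 0.5]))
```

In this toy case the first model alone prefers the wrong hypothesis, while the combined score ranks the other one first. ROVER works differently: it aligns the outputs of several recognizers word by word and votes on each slot, so it is not covered by this sketch.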
Similar resources
Improving Accuracy in Intrusion Detection Systems Using Classifier Ensemble and Clustering
With recent developments in technology, the number of network-based services is increasing, and sensitive information of users is shared through the Internet. Accordingly, large-scale malicious attacks on computer networks could cause severe disruption to network services, so cybersecurity has become a major concern for networks. An intrusion detection system (IDS) could be cons...
Impact of Layout Sequence of the Natural and Synthetic Adsorbents in Double-Layered Composites on Improving the Natural Fiber Acoustic Performance Using the Numerical Finite Element Method
Introduction: The acoustic performance of natural fiber adsorbents has been investigated in numerous studies. A part of these materials show a poor adsorption within the frequency range of less than 1000 Hz. In the present study, attempts were made to investigate the effect of layout sequence of double-layered composites consisting of natural and synthetic fibers on improving the acoustic adsor...
Improving reservoir rock classification in heterogeneous carbonates using boosting and bagging strategies: A case study of early Triassic carbonates of coastal Fars, south Iran
An accurate reservoir characterization is a crucial task for the development of quantitative geological models and reservoir simulation. In the present research work, a novel view is presented on the reservoir characterization using the advantages of thin section image analysis and intelligent classification algorithms. The proposed methodology comprises three main steps. First, four classes of...
Improving the performance of an LVCSR system through ensembles of acoustic models
This paper describes our work on applying ensembles of acoustic models to the problem of large vocabulary continuous speech recognition (LVCSR). We propose three algorithms for constructing ensembles. The first two have their roots in bagging algorithms; however, instead of randomly sampling examples our algorithms construct training sets based on the word error rate. The third one is a boostin...
Ensemble of M5 Model Tree Based Modelling of Sodium Adsorption Ratio
This work reports the results of four ensemble approaches with the M5 model tree as the base regression model to anticipate Sodium Adsorption Ratio (SAR). Ensemble methods that combine the output of multiple regression models have been found to be more accurate than any of the individual models making up the ensemble. In this study additive boosting, bagging, rotation forest and random subspace...